Skip to content

Vsvo embedded bdf#3391

Open
KeshavVenkatesh wants to merge 4 commits intoSciML:masterfrom
KeshavVenkatesh:vsvo-embedded-bdf
Open

Vsvo embedded bdf#3391
KeshavVenkatesh wants to merge 4 commits intoSciML:masterfrom
KeshavVenkatesh:vsvo-embedded-bdf

Conversation

@KeshavVenkatesh
Copy link
Copy Markdown
Contributor

Checklist

  • [Yes ] Appropriate tests were added
  • [ No] Any code changes were done in a way that does not break public API
  • [ Yes] All documentation related to code changes were updated
  • [ Yes] The new code follows the
    contributor guidelines, in particular the SciML Style Guide and
    COLPRAC.
  • Any new documentation only uses public API

Additional context

This PR contains changes to stabiity filter for backward euler and also VSVO implementation for BDF based on the paper here
Add any other context about the problem here.

Sorry for the delay it took me a whie to fully understand the paper and make sure the implementation matches with the algorithm defined in the paper

@ChrisRackauckas : Please review and provide feedback. I did rebase this time too.

@oscardssmith
Copy link
Copy Markdown
Member

How do the benchmarks look for this?

@KeshavVenkatesh
Copy link
Copy Markdown
Contributor Author

@oscardssmith @ChrisRackauckas
I did a benchmark test comparing MOOSE234 and FBDF mehtods. Even though MOOSE234 takes fewer steps on stiff problems, each step requires more Newton iterations (f-evals), so the total cost is higher. I tried an adaptive predictor degree approach but it did not close the gap. I'm investigating further and open to suggestions.

@ChrisRackauckas
Copy link
Copy Markdown
Member

Show the work precision diagrams.

@KeshavVenkatesh
Copy link
Copy Markdown
Contributor Author

OOSE234 vs FBDF: Findings

Looking into why MOOSE234 uses more f-evals than FBDF. Here's what came out of
the analysis.

How I got here

Started by running work-precision diagrams (error vs f-evals) on VdP, ROBER,
and HIRES comparing MOOSE234 against FBDF, QNDF, CVODE_BDF, Rodas5P, RadauIIA5,
and KenCarp4. The plots showed MOOSE234's curve consistently shifted to the right
of FBDF — same accuracy but more work.

Since both solvers use the exact same Newton solver (nlsolve!), same convergence
logic, same maxiters=10, same kappa=1/100, the difference had to come from what
each method feeds into Newton or does around it. I wrote a step-by-step
diagnostic (newton_diagnostics.jl) that hooks into init()/step!() and logs
per-step Newton iterations, f-evals, Jacobian updates, and order for both solvers.

That trace showed two clear patterns:

  • MOOSE234: nf per step = newton_iter + 1 (consistently)
  • FBDF: nf per step = newton_iter (consistently)

So every single step, MOOSE234 was spending one extra f-eval that FBDF wasn't.
Traced it to the f(fsallast) call at the end of perform_step!, then confirmed
that nothing reads that value (isfsal=false, interpolation uses Chebyshev nodes).

After fixing that, re-ran the work-precision plots and the per-step traces
across all four problems to confirm the improvement and understand what gap
remains.

What was wrong

MOOSE234 was calling f(fsallast) after every step that was redundant.
FBDF skips this by recovering f(u) algebraically from the BDF equation.
Removed the dead call — saves exactly 1 f-eval per accepted step.

Numbers (MOOSE234 after fix vs FBDF)

Column key:
tol = solver tolerance (abstol = reltol = this value)
MOOSE234 = total function evaluations (nf) for MOOSE234
FBDF = total function evaluations (nf) for FBDF
Ratio = MOOSE234 nf / FBDF nf (lower is better for MOOSE234)
M steps = accepted steps taken by MOOSE234
F steps = accepted steps taken by FBDF

VdP (mu=1000):
tol MOOSE234 FBDF Ratio M steps F steps
1e-4 43 32 1.34x 24 21
1e-6 82 56 1.46x 37 43
1e-8 213 99 2.15x 71 75
1e-10 246 175 1.41x 194 156

(Before the fix, VdP at 1e-6 was 122 nf. Now 82. That's a 33% drop.)

ROBER:
tol MOOSE234 FBDF Ratio M steps F steps
1e-6 1142 569 2.01x 193 217
1e-8 5778 958 6.03x 785 495
1e-10 5128 1899 2.70x 3527 1184

HIRES (8 eqns):
tol MOOSE234 FBDF Ratio M steps F steps
1e-4 406 420 0.97x 47 85
1e-6 996 618 1.61x 159 169
1e-8 2017 1064 1.90x 640 385
1e-10 7210 2097 3.44x 2923 854

Linear stiff:
tol MOOSE234 FBDF Ratio M steps F steps
1e-4 128 80 1.60x 41 55
1e-6 208 135 1.54x 116 102
1e-8 525 299 1.76x 438 224
1e-10 2009 675 2.98x 1918 503

Why MOOSE234 is still more expensive after the fix

Three things:

  1. FBDF goes up to order 5, MOOSE234 caps at 4. At tight tolerances this really
    shows — FBDF takes way fewer steps (ROBER at 1e-10: 1184 steps vs 3527).

  2. On VdP at loose tolerances, MOOSE234 gets stuck at order 2. At tol=1e-6 it's
    order 2 for 97% of steps. The other problems are fine — ROBER/HIRES/Linear
    all reach order 4 without issues. The problem is the order controller needs
    est3 < est2 to try order 4, and on VdP the FBDF4 filter correction is always
    bigger than the BDF3-Stab correction, so it never tries.

    Order 3 is basically never picked on any problem either — the method jumps
    straight from 2 to 4:

    Problem Ord 2 Ord 3 Ord 4 (at tol=1e-6)
    VdP 97% 0% 3%
    ROBER 10% 0% 90%
    HIRES 16% 1% 83%
    Linear 31% 1% 68%

  3. More Newton iterations in VdP's transient region. MOOSE234 has 5 steps needing
    4-6 Newton iters (t < 0.005) vs FBDF's 1 step with 3 iters. Being stuck at
    order 2 means a worse predictor, so Newton has to work harder.

What I am looking at next

  • Compute Est4 properly so the order controller can pick order 4 on VdP. The
    paper has some details I want to review.

  • Figure out why order 3 is never selected. Might be a problem with how the
    error estimators are set up.

  • MOOSE234 rejects a lot more steps than FBDF (HIRES at 1e-10: 109 vs 21).
    The step size controller might be too aggressive.

Work-precision plots (updated with fix)

Re-ran the full benchmark suite (7 solvers, tolerances 1e-4 to 1e-10) on VdP,
ROBER, and HIRES after the fix. Plots are in test/benchmark_results/.

VdP: MOOSE234 tracks close to FBDF at loose tolerances but falls behind at tight
tolerances where FBDF's order-5 capability kicks in. CVODE_BDF and Radau are the
most efficient on this problem overall.

ROBER: MOOSE234 is the most expensive solver across the board. Its curve is far
to the right of everything else, especially at tight tolerances. This is the
worst-case problem for MOOSE234.

HIRES: MOOSE234 is competitive at loose tolerances — its curve overlaps with
FBDF and QNDF around 1e-4 to 1e-5 error. The gap grows at tight tolerances
but stays smaller than on ROBER. This is the closest to a "win" for MOOSE234
among the problems tested.

Across all three problems, the nf-based work-precision plots confirm that the
f(fsallast) fix helped (curves shifted left compared to pre-fix runs) but
MOOSE234 still has structural disadvantages vs FBDF from the order cap and
order controller behavior described above.

hires_work_precision_nf hires_work_precision_nsolve rober_work_precision_nf rober_work_precision_nsolve vanderpol_work_precision_nf vanderpol_work_precision_nsolve

@KeshavVenkatesh
Copy link
Copy Markdown
Contributor Author

cc @oscardssmith @ChrisRackauckas : Please review and provide suggestions. i plan to spend more time over the weekend to dive deep.

@KeshavVenkatesh
Copy link
Copy Markdown
Contributor Author

Since it is a variable order, u_history might be corrupted with incorrect values causing estimate for est3 and est2 to be error prone. I am looking at storing the raw values and computing everytime.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants